Linguistic resources such as part-of-speech (POS) tags have been extensively used in statistical machine translation (SMT) frameworks and have yielded better performance. However, the use of such linguistic annotations in neural machine translation (NMT) systems has remained under-explored. In this work, we show that multi-task learning is a successful and easy approach to introducing additional knowledge into an end-to-end neural attentional model. By jointly training several natural language processing (NLP) tasks in one system, we are able to leverage common information and improve the performance of the individual tasks. We analyze the impact of three design decisions in multi-task learning: the tasks used in training, the training schedule, and the degree of parameter sharing across the tasks, which is defined by the network architecture. The experiments are conducted for a German-to-English translation task. As additional linguistic resources, we exploit POS information and named entities (NE). Experiments show that the translation quality can be improved by up to 1.5 BLEU points under the low-resource condition. The performance of the POS tagger is also improved using the multi-task learning scheme.
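One of the three design decisions mentioned above is the training schedule, i.e. how batches from the different tasks are interleaved during joint training. The sketch below illustrates one common scheme, proportional task sampling; the task names and weights are illustrative assumptions, not values taken from this work.

```python
import random

def make_schedule(task_weights, n_steps, seed=0):
    """Sample a per-step task sequence proportional to the given weights.

    task_weights: dict mapping task name -> sampling weight. A typical
    choice weights the main translation task higher than the auxiliary
    tagging tasks (the weights here are purely illustrative).
    """
    rng = random.Random(seed)
    tasks = list(task_weights)
    weights = [task_weights[t] for t in tasks]
    # At each training step, pick the task whose batch is used next.
    return [rng.choices(tasks, weights=weights)[0] for _ in range(n_steps)]

# Hypothetical schedule: 60% translation, 20% POS tagging, 20% NE tagging.
schedule = make_schedule({"translation": 0.6, "pos": 0.2, "ne": 0.2}, 10)
```

In an actual multi-task trainer, each step of this schedule would draw a mini-batch from the selected task's data and update the shared and task-specific parameters accordingly.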